Adding a Novel Italian Treebank of Marked Constructions to Universal Dependencies
Abstract
In this paper we present MarkIT, a novel treebank developed to analyse marked constructions in Italian. The resource contains almost 1,300 sentences manually annotated with dependency relations following the Universal Dependencies paradigm. The sentences have been extracted from essays written by high-school students over several years, which accounts for the structural and topical variability of the sentences. In this work, we detail the process used to select the sentences, parse them automatically and then correct them. The treebank covers seven types of marked constructions (839 sentences overall) plus some sentences whose syntax can be wrongly classified as marked and which serve as negative examples of markedness (453 sentences). We also present an evaluation of parsing performance, comparing a model trained on existing treebanks with the model obtained by adding MarkIT to the training set.
Similar resources
A Universal Dependencies Treebank for Marathi
This paper describes the creation of a free and open-source dependency treebank for Marathi, the first open-source treebank for Marathi following the Universal Dependencies (UD) syntactic annotation scheme. In the paper, we describe some of the syntactic and morphological phenomena in the language that required special analysis, and how they fit into the UD guidelines. We also evaluate the parsi...
Universal Dependencies v1: A Multilingual Treebank Collection
Cross-linguistically consistent annotation is necessary for sound comparative evaluation and cross-lingual learning experiments. It is also useful for multilingual system development and comparative linguistic studies. Universal Dependencies is an open community effort to create cross-linguistically consistent treebank annotation for many languages within a dependency-based lexicalist framework...
The Universal Dependencies Treebank of Spoken Slovenian
This paper presents the construction of an open-source dependency treebank of spoken Slovenian, the first syntactically annotated collection of spontaneous speech in Slovenian. The treebank has been manually annotated using the Universal Dependencies annotation scheme, a one-layer syntactic annotation scheme with a high degree of cross-modality, cross-framework and cross-language interoperabili...
The Universal Dependencies Treebank for Slovenian
This paper introduces the Universal Dependencies Treebank for Slovenian. We overview the existing dependency treebanks for Slovenian and then detail the conversion of the ssj200k treebank to the framework of Universal Dependencies version 2. We explain the mapping of part-of-speech categories, morphosyntactic features, and the dependency relations, focusing on the more problematic language-spec...
Slovak Dependency Treebank in Universal Dependencies
We describe a conversion of the syntactically annotated part of the Slovak National Corpus into the annotation scheme known as Universal Dependencies. Only a small subset of the data has been converted so far; yet it is the first Slovak treebank that is publicly available for research. We list a number of research projects in which the dataset has been used so far, including the first parsing r...
Journal
Journal title: IJCoL
Year: 2023
ISSN: 2499-4553
DOI: https://doi.org/10.4000/ijcol.1110